Use dynamic configuration model in bundles#1098
Merged
Merged
Conversation
This change adds: * A `config.Walk` function to walk a configuration tree * A `config.Path` type to represent a value's path inside a tree * Functions to create a `config.Path` from a string, or convert one to a string
andrewnester
approved these changes
Feb 15, 2024
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Feb 16, 2024
#1217) ## Changes When resolving a value returned by the lookup function, the code would call into `resolveRef` with the key that `resolveKey` was called with. In doing so, it would cache the _new_ ref under that key. We fix this by caching ref resolution only at the top level and relying on lookup caching to avoid duplicate work. This came up while testing #1098. ## Tests Unit test.
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Feb 16, 2024
## Changes This builds on #1098 and uses the `dyn.Value` representation of the bundle configuration to generate the Terraform JSON definition of resources in the bundle. The existing code (in `BundleToTerraform`) was not great and in an effort to slightly improve this, I added a package `tfdyn` that includes dedicated files for each resource type. Every resource type has its own conversion type that takes the `dyn.Value` of the bundle-side resource and converts it into Terraform resources (e.g. a job and optionally its permissions). Because we now use a `dyn.Value` as input, we can represent and emit zero-values that have so far been omitted. For example, setting `num_workers: 0` in your bundle configuration now propagates all the way to the Terraform JSON definition. ## Tests * Unit tests for every converter. I reused the test inputs from `convert_test.go`. * Equivalence tests in every existing test case checks that the resulting JSON is identical. * I manually compared the TF JSON file generated by the CLI from the main branch and from this PR on all of our bundles and bundle examples (internal and external) and found the output doesn't change (with the exception of the odd zero-value being included by the version in this PR).
andrewnester
added a commit
that referenced
this pull request
Feb 20, 2024
CLI: * Add support for UC Volumes to the `databricks fs` commands ([#1209](#1209)). Bundles: * Use dynamic configuration model in bundles ([#1098](#1098)). * Allow use of variables references in primitive non-string fields ([#1219](#1219)). * Add an experimental default-sql template ([#1051](#1051)). * Add an experimental dbt-sql template ([#1059](#1059)). Internal: * Add fork-user to winget release workflow ([#1214](#1214)). * Use `any` as type for data sources and resources in `tf/schema` ([#1216](#1216)). * Avoid infinite recursion when normalizing a recursive type ([#1213](#1213)). * Fix issue where interpolating a new ref would rewrite unrelated fields ([#1217](#1217)). * Use `dyn.Value` as input to generating Terraform JSON ([#1218](#1218)). API Changes: * Changed `databricks lakehouse-monitors update` command with new required argument order. * Added `databricks online-tables` command group. OpenAPI commit cdd76a98a4fca7008572b3a94427566dd286c63b (2024-02-19) Dependency updates: * Bump Terraform provider to v1.36.2 ([#1215](#1215)). * Bump github.com/databricks/databricks-sdk-go from 0.32.0 to 0.33.0 ([#1222](#1222)).
Merged
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Feb 20, 2024
CLI: * Add support for UC Volumes to the `databricks fs` commands ([#1209](#1209)). Bundles: * Use dynamic configuration model in bundles ([#1098](#1098)). * Allow use of variables references in primitive non-string fields ([#1219](#1219)). * Add an experimental default-sql template ([#1051](#1051)). * Add an experimental dbt-sql template ([#1059](#1059)). Internal: * Add fork-user to winget release workflow ([#1214](#1214)). * Use `any` as type for data sources and resources in `tf/schema` ([#1216](#1216)). * Avoid infinite recursion when normalizing a recursive type ([#1213](#1213)). * Fix issue where interpolating a new ref would rewrite unrelated fields ([#1217](#1217)). * Use `dyn.Value` as input to generating Terraform JSON ([#1218](#1218)). API Changes: * Changed `databricks lakehouse-monitors update` command with new required argument order. * Added `databricks online-tables` command group. OpenAPI commit cdd76a98a4fca7008572b3a94427566dd286c63b (2024-02-19) Dependency updates: * Bump Terraform provider to v1.36.2 ([#1215](#1215)). * Bump github.com/databricks/databricks-sdk-go from 0.32.0 to 0.33.0 ([#1222](#1222)).
denik
pushed a commit
that referenced
this pull request
May 20, 2026
## Changes The approach to do this was: 1. Iterate over all libraries in all job tasks 2. Find references to local libraries 3. Store pointer to `compute.Library` in the matching artifact file to signal it should be uploaded This breaks down when introducing #1098 because we can no longer track unexported state across mutators. The approach in this PR performs the path matching twice; once in the matching mutator where we check if each referenced file has an artifacts section, and once during artifact upload to rewrite the library path from a local file reference to an absolute Databricks path. ## Tests Integration tests pass.
denik
pushed a commit
that referenced
this pull request
May 20, 2026
…#1211) ## Changes This feature supports variable lookups in a `dyn.Value` that are present in the type but haven't been initialized with a value. For example: `${bundle.git.origin_url}` is present in the `dyn.Value` only if it was assigned a value. If it wasn't assigned a value it should resolve to the empty string. This normalization option, when set, ensures that all fields that are represented in the specified type are present in the return value. This change is in support of #1098. ## Tests Added unit test.
denik
pushed a commit
that referenced
this pull request
May 20, 2026
#1217) ## Changes When resolving a value returned by the lookup function, the code would call into `resolveRef` with the key that `resolveKey` was called with. In doing so, it would cache the _new_ ref under that key. We fix this by caching ref resolution only at the top level and relying on lookup caching to avoid duplicate work. This came up while testing #1098. ## Tests Unit test.
denik
pushed a commit
that referenced
this pull request
May 20, 2026
## Changes
This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).
Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).
The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.
Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).
Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.
To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).
## Tests
* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
denik
pushed a commit
that referenced
this pull request
May 20, 2026
## Changes This builds on #1098 and uses the `dyn.Value` representation of the bundle configuration to generate the Terraform JSON definition of resources in the bundle. The existing code (in `BundleToTerraform`) was not great and in an effort to slightly improve this, I added a package `tfdyn` that includes dedicated files for each resource type. Every resource type has its own conversion type that takes the `dyn.Value` of the bundle-side resource and converts it into Terraform resources (e.g. a job and optionally its permissions). Because we now use a `dyn.Value` as input, we can represent and emit zero-values that have so far been omitted. For example, setting `num_workers: 0` in your bundle configuration now propagates all the way to the Terraform JSON definition. ## Tests * Unit tests for every converter. I reused the test inputs from `convert_test.go`. * Equivalence tests in every existing test case checks that the resulting JSON is identical. * I manually compared the TF JSON file generated by the CLI from the main branch and from this PR on all of our bundles and bundle examples (internal and external) and found the output doesn't change (with the exception of the odd zero-value being included by the version in this PR).
denik
pushed a commit
that referenced
this pull request
May 20, 2026
CLI: * Add support for UC Volumes to the `databricks fs` commands ([#1209](#1209)). Bundles: * Use dynamic configuration model in bundles ([#1098](#1098)). * Allow use of variables references in primitive non-string fields ([#1219](#1219)). * Add an experimental default-sql template ([#1051](#1051)). * Add an experimental dbt-sql template ([#1059](#1059)). Internal: * Add fork-user to winget release workflow ([#1214](#1214)). * Use `any` as type for data sources and resources in `tf/schema` ([#1216](#1216)). * Avoid infinite recursion when normalizing a recursive type ([#1213](#1213)). * Fix issue where interpolating a new ref would rewrite unrelated fields ([#1217](#1217)). * Use `dyn.Value` as input to generating Terraform JSON ([#1218](#1218)). API Changes: * Changed `databricks lakehouse-monitors update` command with new required argument order. * Added `databricks online-tables` command group. OpenAPI commit cdd76a98a4fca7008572b3a94427566dd286c63b (2024-02-19) Dependency updates: * Bump Terraform provider to v1.36.2 ([#1215](#1215)). * Bump github.com/databricks/databricks-sdk-go from 0.32.0 to 0.33.0 ([#1222](#1222)).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
This is a fundamental change to how we load and process bundle configuration. We now depend on the configuration being represented as a
dyn.Value. This representation is functionally equivalent to Go'sany(it is variadic) and allows us to capture metadata associated with a value, such as where it was defined (e.g. file, line, and column). It also allows us to represent Go's zero values properly (e.g. empty string, integer equal to 0, or boolean false).Using this representation allows us to let the configuration model deviate from the typed structure we have been relying on so far (
config.Root). We need to deviate from these types when using variables for fields that are not a string themselves. For example, using${var.num_workers}for an integerworkersfield was impossible until now (though not implemented in this change).The loader for a
dyn.Valueincludes functionality to capture any and all type mismatches between the user-defined configuration and the expected types. These mismatches can be surfaced as validation errors in future PRs.Given that many mutators expect the typed struct to be the source of truth, this change converts between the dynamic representation and the typed representation on mutator entry and exit. Existing mutators can continue to modify the typed representation and these modifications are reflected in the dynamic representation (see
MarkMutatorEntryandMarkMutatorExitinbundle/config/root.go).Required changes included in this change:
libs/dyn/dynvar.To be implemented later:
bundle/config/paths.Pathbeing set and populated throughConfigureConfigFilePath. Instead, they should interact with the dynamically typed configuration directly. Doing this also unlocks being able to differentiate different base paths used within a job (e.g. a task override with a relative path defined in a directory other than the base job).Tests